perm filename MODPRO.PRA[W77,JMC] blob
sn#261996 filedate 1977-02-06 generic text, type C, neo UTF8
.XGP
.VSP 20
.SINCH 8.5,6.5,11,9,1
.FONT 0 YVONNE;MF1 KST
.ADJUST
.SPRETA
.DUMMY ←
.SINGLE
.CE
A PROOF-CHECKER FOR DYNAMIC LOGIC
ABSTRACT
A rule of inference for a modal logic of programs is
described. It is proposed to implement a proof-checker based on this
rule.
INTRODUCTION
We begin by describing a logical language for talking about
programs. This language, introduced in [13], combines for the first
time the succinctness of first-order logic with a useful amount of the
flexibility of second-order logic in describing the behavior of
programs that manipulate their environment. This is accomplished by
generalizing the notion of first-order quantifier to a wider class of
adverbial constructs of the form "after executing b ...". Because
these "quantifiers" can be alternated just as conventional quantifiers
can be alternated in first order logic, it is possible to express not
only partial correctness and termination concepts but such facts as
"if you set Y to X+5 then it is possible by repeatedly executing
Y:=Y-1 to make Y=X" or "If P holds then after executing b one of the
(possibly many) ways of executing b backwards will make P hold once
again." Neither of these two self-evident facts can be stated in
Hoare's partial correctness formalism or in, say, the total
correctness formalism of Manna and Waldinger [9], though they can be
stated in any formalism that permits explicit manipulation of states
and/or programs as individuals, as in [2] where states can be
quantified over, or [4] where programs are terms.
Before supplying a sense in which quantifiers can be
generalized to deal with programs, let us construct a framework in
which we can conduct our metamathematics. Except for our treatment of
quantifiers as modalities, the following should be recognized as being
essentially standard first-order logic; we go into detail only to
avoid notational confusion in the sequel. The central concept here is
the →symbol_; we envisage four kinds of symbols, →function symbols_,
→predicate symbols_, →logical connectives_, and →modalities_, of
various arities ≥ 0, with the condition that modalities be unary. The
four concepts dependent on the concept of symbol are language, state,
universe, and expression. A →language_ is a set of symbols. A
→state_ (interpretation, world, environment; interpretation is the
preferred term in logic, but we shall use state for brevity and
because we wish to think of machine states as no more than
interpretations) of a language specifies a →domain_ D (carrier,
underlying set) and assigns a →value_ to every symbol in the language
except the modalities; it assigns k-ary functions on D to k-ary
function symbols, k-ary predicates on D to k-ary predicate symbols,
and k-ary boolean functions to k-ary logical connectives. A
→universe_ (Kripke structure [7]) specifies a set U of states having a
common domain D, and assigns a value to every modality in the
language, namely a binary relation on U; we shall frequently write [a]
to denote a modality whose binary relation is a, and abbreviate its
dual ¬[a]¬ to <a>. (The notation is motivated by the box and diamond
notations of modal logic [6]; in conversation we have found it convenient
to pronounce [a] as "box a" and <a> as "diamond a.") Symbol X is
→constant_ in U when X has the same value in every state of U. In the
sequel we assume that logical connectives and modalities are constant.
Symbol X is →uninterpreted_ in U when for every state # in U and every
possible value V of X on D there exists a state #' in U differing from
# only in the value of X, which in #' is V. An →expression_ over the
language L is a pair having as components a k-ary symbol of L (its
→operator_) and a k-tuple of expressions over L (its →operand_, whose
components we call the →arguments_ of the operator). (The presence of
0-ary symbols in the language guarantees the existence of finite
expressions; in the sequel we exclude infinite expressions.) In this
paper we write non-modal expressions as F(a,b,...) (or aFb for such F
as +,=,>,⊃∧, etc), F() as F, and modal formulae as [a]P or <a>P.
Expressions are classified according to their operators as either
→terms_ (function symbols) or →formulae_ (the rest). Formulae with
predicate-symbol operators are called →atomic_ while formulae with
modal operators are called modal. The arguments of function and
predicate symbols must be terms; all other arguments must be formulae.
Like symbols, expressions are assigned values by states; expressions
other than modal formulae are assigned values by the standard Tarskian
method (the value of an expression in # is the result of applying the
value of the operator in # to the values of the arguments in #). The
value of the modal formula [a]P in # is true when P is true in every $
satisfying #a$, and false otherwise. It follows that the value of
<a>P in # is true when P is true in some $ satisfying #a$, and false
otherwise.
In the above framework we can readily define the notion of a
first-order quantifier. Let X be an uninterpreted zeroary function
symbol and let X:=! denote that binary relation on U that relates
pairs of states differing at most in their value of X. (This
is an equivalence relation. Programming readers may consider it as
the non-deterministic assignment statement that assigns an arbitrary
element of the domain D to X.) Then it should be evident that we may
take ∀X to be the modality [X:=!] and ∃X to be <X:=!>.
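As an illustrative aside, the semantics just given can be animated over a finite universe. The following Python sketch uses an encoding of our own devising (the names assign_any, box and diamond are ours, not the paper's): a state assigns values to the zeroary function symbols, a modality is a binary relation on states, and [a]P and <a>P are evaluated by the clauses above.

```python
from itertools import product

def assign_any(X, states):
    """The relation X:=! : relates pairs of states differing at most
    in their value of X (the non-deterministic assignment)."""
    return [(s, t) for s, t in product(states, states)
            if all(s[k] == t[k] for k in s if k != X)]

def box(rel, P, states):
    """States where [a]P holds: P true in every a-successor."""
    return [s for s in states
            if all(P(t) for (u, t) in rel if u == s)]

def diamond(rel, P, states):
    """States where <a>P = ¬[a]¬P holds: P true in some a-successor."""
    return [s for s in states
            if any(P(t) for (u, t) in rel if u == s)]

# Over domain D = {0,1,2} with symbols X and Y, [X:=!] behaves as the
# quantifier ∀X and <X:=!> as ∃X.
D = [0, 1, 2]
U = [{'X': x, 'Y': y} for x in D for y in D]
a = assign_any('X', U)
forall_X = box(a, lambda s: s['X'] >= 0, U)            # ∀X. X ≥ 0
exists_XY = diamond(a, lambda s: s['X'] == s['Y'], U)  # ∃X. X = Y
```

Both example formulae hold in every one of the nine states, as the reader may confirm by inspection.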
.BLOCK 5
A CONVENTIONAL AXIOM SYSTEM
Let us now exhibit a sound complete effective axiom system for
first-order logic. (By effective we mean one whose axioms form a
recursive set and whose inference rules are recursive relations on
formulae.) The theorems of the following system are exactly the
valid formulae, those that are true in every state of every universe
in which the logical connectives have their standard values. The
system is conventional inasmuch as the rules and axioms are stated
fairly conventionally, in contrast to the rule of the section headed
"An unconventional axiom system." A novelty of the present system is
that it separates into logical and non-logical components what are
usually taken to be entirely logical rules and axioms, on the
principle that facts about X:=! are program-specific facts.
→Logical Axioms_
All tautologies of Propositional Calculus.
[a](P⊃⊃Q) ⊃⊃ ([a]P ⊃⊃ [a]Q) . (Axiom M)
→Logical Inference Rules_
P, P⊃⊃Q ⊢ Q . (Modus Ponens)
P ⊢ [a]P (subsumes generalization, P ⊢ ∀xP).
→Non-logical Axioms_
∀XP ⊃⊃ P[X←T] (any term T; P[X←T] is P with T substituted for X)
P ⊃⊃ ∀XP unless X occurs free in P
The second logical axiom (M) above can be thought of as a
claim about programs; it says that for all states #, if P⊃⊃Q holds for
all states $ satisfying #a$ and if P holds for all states $ satisfying
#a$ then Q holds for all states $ satisfying #a$. The second
inference rule (really the rule of necessitation of modal logic) can
be considered as an upper bound on the power of programs, which cannot
falsify theorems; if P is a theorem then it is true after executing a.
In our system it is straightforward to prove as theorems the axioms
of, say, Mendelson's system K [10] (p. 57), and it should be clear that
the second rule subsumes the rule of generalization; in fact, if the
only modalities allowed are those with values of the form X:=! then
the rule of necessitation →is_ the rule of generalization, and the
theorems of this system coincide with the theorems of K.
What makes this language of interest to us is its potential
for expressing procedural facts. Let us look beyond X:=! to a wider
class of programs. Briefly, we shall consider the class of programs
containing (in addition to X:=!) tests and assignments, and closed
under the operations of union, composition and reflexive transitive
closure. Though this class may seem small, it manages to include all
the programs that can be written using constructs available in, say,
ALGOL 60 minus procedures, block structure, designational expressions
and ALGOL's treatment of booleans.
A →test_ is a restriction of the identity relation on U to
some subset V of U, the →domain_ of the test. Thus a test acts like a
non-destructive gate, allowing control to proceed just when the
machine state is a state in V. We shall consider only tests of the
form P? where P is a formula of the language. The domain of P? is just
the set of models of P (states satisfying P); for example X>0? permits
control to proceed just when X is positive.
A →first-order assignment_ (so-called because it assigns only
individuals, not entire functions, even when assigning to array
elements) is a function on U (a binary relation b such that each # in
U determines a unique $ satisfying #b$) of the form
λ#.λA. if A ≠ F then A(#)
       else λū. if ū ≠ s̄(#) then F(#)(ū)
       else t(#).
This function when applied to # yields a state in which all
symbols save k-ary F (for some choice of F determined by the
assignment statement) remain unchanged, while F itself is changed at
only one value of its argument, that value being determined by s̄,
which is a function from states to k-tuples of individuals of D; at
that point the new value of F is given by t, which is a function from
states to individuals of D. This is array assignment; the array is F,
the selected point is s̄(#) and the value assigned is t(#). We shall
consider only assignments of the form F(S̄):=T where F(S̄) (S̄ a k-tuple
of expressions) and T are terms over the language. In this case s̄(#)
is given by the value of S̄ in # and t(#) by the value of T in #. When
k=0 we have simple variable assignment, F:=T.
The four operations on binary relations that we are concerned
about are given as usual by the smallest (with respect to set
inclusion) solutions of the following equivalences.
#(a∪b)$ ≡ #a$ ∨ #b$ Union
#(a⊃↔b)K ≡ ∃$(#a$ ⊃∧ $bK) Composition
#(a⊃π)$ ≡ $a# Converse
#(a*)$ ≡ #=$ ∨ #(a⊃↔a*)$ Reflexive Transitive Closure
These operations give us enough power to express the two facts
given in the first paragraph of this introduction; they are
respectively [Y:=X+5]<(Y:=Y-1)*>Y=X and P⊃⊃[a]<a⊃π>P. Neither Hoare's
nor Manna and Waldinger's formalism allow us to express either of
these concepts, since each permits only one [a] or one <a> and forbids
alternating [a] with <b>. Obviously they can be expressed in a
language that permits quantification over states (as done in [2]), but
this represents a larger step away from first-order logic than we have
taken here. More generally, this language subsumes Hoare's P{a}Q
notation, which we can express as α(P⊃⊃[a]Q). Similarly α(P⊃⊃<a>Q)
expresses the fact that whenever P holds when program a starts, a can
eventually reach a final state in which Q holds. This captures nicely
the notion of termination, allowing us to do for Manna and Waldinger
[9] what Hoare [5] did for Floyd [3], namely to capture for text
programs what had been done for flowcharts. Of course termination is
only one facet of our language; neither of the two above examples fits
either a partial correctness or a total correctness framework, since
they alternate boxes with diamonds in a way not possible in an
ordinary total correctness formalism. (The first example loses a
little impact due to the fact that [Y:=X+5]P is equivalent to
<Y:=X+5>P, but this equivalence is only guaranteed for programs that
are total functions on states; thus a more complex example would
dispose of this objection.)
For the moment we shall ignore the third and fourth of these
operations, and focus on "straight-line" programs, the term we shall
use for programs synthesized from tests and assignments using union
and composition. This coincides with the usual notion of straight
line program inasmuch as we can express, say, "if P then a else b" as
(P?⊃↔a)∪(¬P?⊃↔b). Composition of course plays the role of ";". We
are now in a position to add further axioms and rules to the above
system, thereby extending its non-logical component.
[P?]Q ≡ P⊃⊃Q
[F(S̄):=T]P ≡ P' where P' denotes a fairly involved syntactic transformation of P given
in conjunction with Theorem 4 of [13]; P' = P[F←T] for 0-ary F.
[a∪b]P ≡ [a]P⊃∧[b]P
[a⊃↔b]P ≡ [a][b]P
Interestingly (but fairly obviously, as demonstrated in [13]),
the axiom system with these four new axioms remains sound, complete
and effective. (It is possible to give further axioms to handle the
converse operation, still preserving soundness, completeness and
effectiveness. However we shall not make use of this in the
following.)
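The soundness of the union and composition axioms is easy to observe on finite models. The following sketch (Python; the encoding and names are ours) checks the two equivalences [a∪b]P ≡ [a]P⊃∧[b]P and [a⊃↔b]P ≡ [a][b]P on arbitrary relations; they hold whatever relations are drawn.

```python
from itertools import product
import random

random.seed(0)
S = range(6)

def box(rel, P):
    """States of S where [a]P holds: P true at every a-successor."""
    return {s for s in S if all(P(t) for (u, t) in rel if u == s)}

def compose(a, b):
    return {(x, z) for (x, y) in a for (y2, z) in b if y == y2}

# Arbitrary relations and an arbitrary predicate over S.
a = {p for p in product(S, S) if random.random() < 0.3}
b = {p for p in product(S, S) if random.random() < 0.3}
P = lambda s: s % 2 == 0

# [a∪b]P ≡ [a]P ∧ [b]P
assert box(a | b, P) == box(a, P) & box(b, P)
# [a⊃↔b]P ≡ [a][b]P
assert box(compose(a, b), P) == box(a, lambda s: s in box(b, P))
```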
.BLOCK 5
AN UNCONVENTIONAL AXIOM SYSTEM
We could at this point proceed with the discussion of our
ultimate objective, the construction of a proof-checking program that
would check proofs expressed in the above axiom system. Unfortunately
the above system is too weak to permit reasonably succinct proofs; for
example, it appears that 6 lines are needed to prove <X:=1>X=1 from
the assumption 1=1 using the above system. In this section we explore
a reformulation of the system with an eye on strengthening the axioms
and rules. In this respect we are emulating J.A. Robinson [15], who
prescribed a new axiom system to facilitate the construction of
automatic theorem-provers. The constraints on a proof-checker are
somewhat different from those of a theorem-prover, and the arguments
for Robinson's resolution rule are not sufficiently compelling for us.
In particular, the convenience of having a clause as the unit of
information, which helps an automatic theorem prover organize the
proof, may be more hindrance than help in a proof-checker because the
user may not have conceived his proof in terms of clauses that are
disjunctions of literals. This is not to say that we shall not make
use of unification; indeed, unification is a most valuable tool in
automated logic.
We propose to replace all of the above axioms and rules by a
single rule. This rule lacks the simplicity of the resolution rule;
however, it has the two advantages that it imposes no constraints (such
as conjunctive normal form) on the formulae referred to in each step
of the proof, and that it functions positively rather than negatively;
instead of assuming the negation of the theorem to be proved and
manipulating what hopefully will turn out to be an inconsistent set of
clauses, the user derives at each step a fact that follows from
assumptions he has made. This should permit the proofs to be more
natural, both in the ordinary sense of "natural" and (applying the
deduction theorem) in Gentzen's. The rule we envisage is sufficiently
complex that it is at once a drawback and an advantage that it will
not always be possible for the casual reader of a proof to tell
whether a given step is valid. We leave to the user the task of
formulating his proof so that a fellow human can follow it. We
envisage however that a major application of this program (or more
realistically, programs based on it) will be in supplying
machine-certified software. This has been a dream of computer
scientists for sufficiently long that for many it has taken on the
form of a pipe dream. We confess to some optimism about this dream;
we foresee a situation arising in which, using a program like ours, an
experienced programmer will be able to generate program and proof with
perhaps three times the effort he would expend on the program alone.
If this magic number of three is near the mark, then taking into
account the cost of debugging and maintaining software, which by
present estimates exceeds by a factor of five or so the cost of
writing the program, this approach to reliable software may be of
considerable economic significance.
We now give the details of the rule, which we call the Show
Rule for lack of a more descriptive term. A proof step using it looks
like
SHOW S USING T0, T1, T2, ...
Ideally, one would like this rule to apply whenever T0, T1,
T2, ... logically entail S; unfortunately, that would lead to a
non-effective proof-checker, since logical entailment is not even
partially decidable for our language; in fact, it is as hard to decide
as arithmetic truth [14]. We shall make do with a less appealing,
more complex, and operationally specified semantics for this rule.
The rule is defined by the algorithm that checks it. (One
might say this of Modus Ponens, that it is just a (trivial) algorithm
for checking a proof step.) The algorithm has three phases, which we
shall for convenience label as EXPAND, IDENTIFY and VERIFY. The
EXPAND phase first negates S, with the idea of showing that ¬S is
inconsistent with the T's. Then it converts the negated S and the T's
to "monotonic" normal form, in which all logical connectives are
converted to ⊃∧'s, ∨'s and ¬'s, and the ¬'s are pushed "down" to the
atomic formulae. Thus the only operators whose arguments can contain
literals are ⊃∧'s, ∨'s, [a]'s and <a>'s. These four operators are all
monotonic ([a] and <a> amount to a generalized ⊃∧ and ∨ respectively).
Then EXPAND eliminates as many modalities as it can, for example by
applying the axioms for union and assignment, Skolemizing existential
quantifiers, etc. Past this stage, any remaining uneliminated
modalities are regarded as uninterpreted. Discarding information in
this way may make an inconsistent set of formulae consistent, but not
conversely. After all, we are concerned only about the soundness of
the Show Rule, not about whether it can make giant logical leaps. (We
do care whether a valid formula can eventually be proved by repeated
application of the Show Rule, along with any necessary non-logical
axioms, and we make the claim that in this sense the Show Rule is
complete, which follows from the evident fact that it subsumes each of
the axioms and rules of the preceding section.)
Note that in all of this, EXPAND has made no attempt to put the
formulae into prenex normal form. The problem is that quantifiers do
not always commute with other modalities, and the non-quantifier
modalities do not move "up" the expression as freely as do
quantifiers. Fortunately prenex normal form is really only a luxury,
so in dispensing with it we do not suffer undue hardship.
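The core of EXPAND's normal-form step can be sketched as follows (Python; the tuple encoding of formulae is our own illustration, not the proposed implementation). Implications are eliminated and negations pushed down to the atoms, with [a] and <a> treated as duals just as ⊃∧ and ∨ are.

```python
# Formulae as nested tuples: ('and',p,q), ('or',p,q), ('not',p),
# ('imp',p,q), ('box',a,p), ('dia',a,p); atoms are strings.

def mono(f, neg=False):
    """Monotonic normal form: eliminate ⊃⊃, push ¬ down to the atomic
    formulae.  [a] and <a> dualize like ∧ and ∨."""
    if isinstance(f, str):                       # atomic formula
        return ('not', f) if neg else f
    op = f[0]
    if op == 'not':
        return mono(f[1], not neg)
    if op == 'imp':                              # p⊃⊃q is ¬p∨q
        return mono(('or', ('not', f[1]), f[2]), neg)
    if op in ('and', 'or'):
        dual = {'and': 'or', 'or': 'and'}[op]
        return (dual if neg else op, mono(f[1], neg), mono(f[2], neg))
    if op in ('box', 'dia'):
        dual = {'box': 'dia', 'dia': 'box'}[op]
        return (dual if neg else op, f[1], mono(f[2], neg))
    raise ValueError(op)

# ¬(P ⊃⊃ [a]Q) becomes P ∧ <a>¬Q, the form EXPAND gives a negated goal.
nf = mono(('not', ('imp', 'P', ('box', 'a', 'Q'))))
```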
The IDENTIFY phase's objective is to rename the literals as
simply propositional letters, in the process assigning the same letter
to as many literals as it can without converting a consistent set of
formulae into an inconsistent set. It does this by attempting to
unify pairs of literals occurring in the formulae. The details of
what the unification algorithm regards as variables are omitted here,
although to a first approximation these are the universally quantified
zeroary function symbols. (Assignments by programs to such symbols
introduce complications.) In partial compensation for the lack of
prenex normal form, IDENTIFY only unifies those literals that have the
same sequence of modalities (ignoring the distinction between [a] and
<a> but treating [a] and [b] as different) above them, along the path
from the literal to the root (thinking of an expression as a tree).
This restriction on what is to be identified is possible because the
VERIFY phase will not be able to make use of any unifications that are
made between literals that are "accessed" from the root via different
sequences of programs. IDENTIFY will put [a]-type modalities "on top
of" any of T0, T1, T2, ... (but not the negated S) if necessary to
identify two literals. This encompasses the second rule of the above
axiom system (the modal rule of necessitation) in a simple but general
way. When IDENTIFY has identified all pairs of literals that are
unifiable, it then proceeds to replace them with letters, using the
same letter for all literals that are unifiable. Where three or more
literals do not have the same unifier, e.g. P(X), P(0) and P(1),
or when different [a]'s need to be put on top of a given T, copies of
formulae are made (in the example, an extra copy of the formula
containing P(X) is made) so that every combination can be realized.
(As this is a proof-checker and not a theorem-prover, the user has
substantial control over the amount of work entailed here, and by
suitably structuring his proof, including where necessary or desirable
the use of introduced zeroary predicate symbols to stand for large
formulae, can make the Show Rule's task easier than is suggested by
the potentially explosive operations we have described, which may
yield O(n²) literals from O(n).) Finally it assembles the formulae
into one expression by taking their conjunction.
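The unification at the heart of IDENTIFY is the standard first-order algorithm; a minimal sketch follows (Python; variable and constructor conventions are ours, and the complications of program-assigned symbols mentioned above are ignored).

```python
# Terms: a variable is a string beginning with '?'; a compound term is
# a tuple (function_symbol, arg, ...); other strings are constants.

def walk(t, s):
    """Chase variable bindings in substitution s."""
    while isinstance(t, str) and t.startswith('?') and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(x, y, s=None):
    """Most general unifier of x and y extending s, or None if the
    two literals cannot be identified (occurs check included)."""
    if s is None:
        s = {}
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if isinstance(x, str) and x.startswith('?'):
        return None if occurs(x, y, s) else {**s, x: y}
    if isinstance(y, str) and y.startswith('?'):
        return unify(y, x, s)
    if isinstance(x, tuple) and isinstance(y, tuple) \
            and len(x) == len(y) and x[0] == y[0]:
        for a, b in zip(x[1:], y[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

# P(?X, f(?X)) unifies with P(0, ?Y), binding ?X to 0 and ?Y to f(0).
s = unify(('P', '?X', ('f', '?X')), ('P', '0', '?Y'))
```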
The resulting formula of propositional modal logic is now
given to VERIFY to check that it is not satisfiable, a decidable
question. VERIFY works by converting its input to disjunctive normal
form, which has its usual definition given that modal formulae are
treated as literals; each modality's argument is recursively reduced
to disjunctive normal form. Boxes are then eliminated from the
formula by the appropriate generalization of the transformation
<a>P⊃∧[a]Q -> <a>(P⊃∧Q), which preserves satisfiability for the
intuitively obvious reason that [a]Q acts only as a constraint on
those worlds one might construct (in attempting to satisfy <a>P) that
are accessible via a and satisfy P, namely that in any such world Q
must be true. The main reason for the choice of disjunctive normal
form is that it simplifies the general case of this transformation
considerably. With all boxes eliminated, the satisfiability of the
result no longer depends on the interpretations of the diamonds;
indeed, satisfiability of the whole is preserved if <a>P is replaced
by TRUE when P is satisfiable and FALSE when not. Since
satisfiability of disjunctive normal form formulae is trivial to
decide (just apply a few straightforward simplifications - the formula
is satisfiable if and only if the result is not FALSE), we can proceed
recursively, working up from the lowest diamonds.
A typical simple application of what may seem at first a
technique rather remote from the everyday world of reasoning about
programs is the problem of deducing that if a makes P hold and if
whenever P holds then b makes Q hold, then a⊃↔b makes Q hold. We can
rephrase this as the problem of deducing [a⊃↔b]Q from [a]P and
P⊃⊃[b]Q, where both a and b are uninterpreted (a situation that arises
whenever one has reached a point in the proof where one does not want
to look at the code of a or b again but wants merely to use already
proved facts about a and b). First we negate [a⊃↔b]Q to produce
<a⊃↔b>¬Q in monotonic (in fact disjunctive) normal form. Then we
eliminate ⊃↔ yielding <a><b>¬Q. The rest of EXPAND and IDENTIFY is
obvious and trivial, except that in order to identify the two
occurrences of the literal Q (or of P for that matter) we need to
rewrite ¬P∨[b]Q as [a](¬P∨[b]Q). The formula given to VERIFY is then
<a><b>¬Q ⊃∧ [a]P ⊃∧ [a](¬P∨[b]Q). Eliminating boxes proceeds thus:
<a>(<b>¬Q ⊃∧ P ⊃∧ (¬P∨[b]Q)), then (renormalizing) <a>(<b>¬Q⊃∧P⊃∧¬P ∨
<b>¬Q⊃∧P⊃∧[b]Q), then (simplifying) <a>(<b>¬Q⊃∧P⊃∧[b]Q), then
<a>(<b>(¬Q⊃∧Q)⊃∧P), which boils down to <a><b>FALSE, that is, it is
possible to execute a followed by b to reach a world where FALSE is
true. This unsatisfiable formula shows that ¬S is inconsistent with
the hypotheses, and so S follows as claimed.
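The VERIFY phase on this example can be sketched executably (Python; our own encoding of formulae, and a tableau-style formulation of the box-elimination transformation rather than the literal DNF rewriting described above). A conjunct is unsatisfiable if its atoms clash, or if some <a>P in it, taken together with the constraints [a]Q per <a>P⊃∧[a]Q -> <a>(P⊃∧Q), is recursively unsatisfiable.

```python
from itertools import product

# Monotonic-normal-form formulae as nested tuples ('and',p,q),
# ('or',p,q), ('not',atom), ('box',a,p), ('dia',a,p); atoms as strings.

def disjuncts(f):
    """Disjunctive normal form: a list of conjuncts, each a list of
    atoms, negated atoms, boxes and diamonds."""
    if isinstance(f, tuple) and f[0] == 'or':
        return disjuncts(f[1]) + disjuncts(f[2])
    if isinstance(f, tuple) and f[0] == 'and':
        return [c1 + c2 for c1, c2 in
                product(disjuncts(f[1]), disjuncts(f[2]))]
    return [[f]]

def conj_of(fs):
    out = fs[0]
    for g in fs[1:]:
        out = ('and', out, g)
    return out

def sat(f):
    """Satisfiability: a disjunct survives if its atoms do not clash
    and every <a>P in it is satisfiable once the [a]Q constraints are
    folded in (<a>P ∧ [a]Q -> <a>(P∧Q))."""
    for conj in disjuncts(f):
        atoms = {l for l in conj if isinstance(l, str)}
        negs = {l[1] for l in conj
                if isinstance(l, tuple) and l[0] == 'not'}
        if atoms & negs:
            continue                       # contains P ∧ ¬P
        boxes = [l for l in conj if isinstance(l, tuple) and l[0] == 'box']
        dias = [l for l in conj if isinstance(l, tuple) and l[0] == 'dia']
        if all(sat(conj_of([p] + [q for (_, b, q) in boxes if b == a]))
               for (_, a, p) in dias):
            return True
    return False

# The worked example: <a><b>¬Q ∧ [a]P ∧ [a](¬P ∨ [b]Q) is unsatisfiable,
# so [a⊃↔b]Q indeed follows from [a]P and P⊃⊃[b]Q.
f = conj_of([('dia', 'a', ('dia', 'b', ('not', 'Q'))),
             ('box', 'a', 'P'),
             ('box', 'a', ('or', ('not', 'P'), ('box', 'b', 'Q')))])
```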
Though this proof has the reductio-ad-absurdum flavor of a
resolution proof, recall that the user is not concerned with how SHOW
works internally; as far as he is concerned each step of the proof
proceeds in a natural, positive fashion. This is in contrast to a
resolution proof, where the theorem to be proved is negated at the
outset and the whole proof works towards deducing an inconsistency.
Thus a proof in our system should be more natural than a resolution
proof, provided the reader of one of our proofs is willing to take the
word of the Show Rule at each step. In a carefully worked out proof
this need not be an unreasonable request.
The amount of work the Show Rule has to go through to deduce
what seems an intuitively obvious fact about programs emphasizes that
the intuitively obvious need not be obvious to an algorithm that has
not been explicitly told a fact (such as gcd(55,89)=1, which is
"intuitively obvious" to anyone who recognizes that 55 and 89 are
consecutive Fibonacci numbers) but has to deduce that fact a hard
way. With other facts about programs, ones that the Show Rule would
handle with no more difficulty, a person may have considerably more
difficulty. For example, we remarked earlier that P⊃⊃[a]<a⊃π>P
(that is, a⊃π can undo the effect of a) was "self-evident." However,
simply taking the contrapositive (A⊃⊃B -> ¬B⊃⊃¬A) of this fact,
itself a "self-evident" truth-preserving transformation, yields the
far less obvious fact <a>[a⊃π]P⊃⊃P, which says that if a can take you
to a state where running a backwards is guaranteed to make P true,
then P must have been true to begin with. The difference between the
human and the machine is that the human has heuristically learnt to
cope with frequently arising cases, while the machine has a commitment
to completeness. In addition to its generally applicable rules, the
machine can of course also have frequently arising cases catered for
especially; however, we do not at present have any data on what those
cases ought to be. Moreover, we expect the user to organize his
proofs so that even though the proof checker may have a fair amount of
work to do to check each step (as in the above worked example), the
workload should not be excessive for the computer; after all, at a
million instructions per second, what may seem to us long-winded may
still only take milliseconds on the computer.
AXIOMS FOR PROGRAMS WITH LOOPS
If straight-line programs were all that could be proved
correct in our system, it would find relatively little application.
The beauty of axiom systems is that they are modular; one can add on
axioms as they are needed. Although the one-rule system of the
previous section may appear to be just a program that implements a
proof-checker for the conventional axiom system of the section before
it, we were careful to treat this program as an inference rule,
the Show Rule. In this way the program can be just a component of a
more powerful system. For example, with programs that deal with
arithmetic one would like a variety of axioms and rules relevant to
arithmetic. And if programs may have loops then one would like rules
for these also. In the latter case we propose to have the following
axioms and rules.
<a^n>P ⊃⊃ <a*>P Axioms of Intent (one for each n ≥ 0).
P⊃⊃[a]P ⊢ P⊃⊃[a*]P Rule of Invariance.
P[N←N+1]⊃⊃<a>P ⊢ P⊃⊃<a*>P[N←0] Rule of Convergence.
These axioms and rules are explained and justified in more
detail in [13].
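The Rule of Invariance is easy to witness on a finite model (an illustrative Python sketch with our own encoding, not part of the proposed system): if one step of a preserves P at every state, then so does any number of steps.

```python
# a adds 2 to a counter over states 0..7; evenness is an invariant.
S = range(8)
a = {(n, n + 2) for n in range(6)}
P = lambda s: s % 2 == 0

def box(rel, Q):
    """States of S where [rel]Q holds."""
    return {s for s in S if all(Q(t) for (u, t) in rel if u == s)}

def star(rel):
    """Reflexive transitive closure of rel over S."""
    r = {(s, s) for s in S}
    while True:
        r2 = r | {(x, z) for (x, y) in rel for (y2, z) in r if y == y2}
        if r2 == r:
            return r
        r = r2

# The premise P⊃⊃[a]P holds at every state ...
assert all(s in box(a, P) for s in S if P(s))
# ... and hence so does the conclusion P⊃⊃[a*]P.
assert all(s in box(star(a), P) for s in S if P(s))
```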
The reader familiar with the considerable amount of on-going
work in automatic program verification may see reflected in our
proposed structuring of the axiom system the general philosophy of
"cut the loops with user-supplied assertions and then let the verifier
work on the resulting straight-line program fragments." This
philosophy represents a step from purely automatic theorem-proving
towards proof-checking. Conversely, our idea of the extremely
powerful Show Rule represents a step from pure proof-checking towards
automatic theorem-proving. These observations support the idea that
while fully automatic program verification is too hard for computers,
generating naive proofs is too hard for humans. These two constraints
are likely to bring the theorem-provers and the proof-checkers closer
together, until perhaps they find that they are solving the same
problem.
.BLOCK 5
DESIGN OF A PROOF-CHECKER
Our objective is to construct a system for checking proofs of
formulae in the language described in the introduction. In this we
are following in the footsteps of Milner [11,12,17], who is doing for Scott's
Logic of Computable Functions what we propose to do for the above
modal extension to first-order logic. Another system related to ours
is Richard Weyhrauch's [1,16] FOL (First-Order Logic) proof-checker. The
most significant detail in which our program will differ from theirs
(apart from the obvious one of choice of logical language) is that our
program will make less of an effort to help the user interactively
than is done by either LCF or FOL, but rather will be, in the
beginning at least, a system in which the user prepares his proof
exactly as though he were writing a program. This means that his
proof will exist on a file and will be read by the proof-checker just
as an interpreter reads a program from a file. This should permit us
to focus all of our effort on the proof-checker proper. In this
regard it is worth remarking that MACLISP (the dialect of LISP in
which we shall implement the proof-checker) offers a general-purpose
syntactic preprocessor that will allow us to avoid any significant
diversion of our energies to the unrewarding task of writing a parser
for the logical notation used in the proofs.
The details of the construction of the proof-checker are
tedious and, beyond what we have already said in the previous
sections, not significant theoretically. The checker will keep track
of what formulae depend for their proof on what other formulae. If P
depends on Q, which later is proved from R, then P will come to depend
on R. With this sort of flexibility, proofs can be written in any
order, so that the user can use either forward or backward reasoning
and write down the steps of the proof in the corresponding order.
Thus if the final goal is P and the user can see that, "chaining
backwards," the next to last step must be Q, then the first step in
his proof might be to infer P from Q. The next step might be to infer
Q from, say, R. These two steps could have been given in either
order. Globally, the checker sees the proof as a directed acyclic
graph whose vertices are formulae and whose edges are immediate
dependencies. A proof is a list of edges, given in any order. The
proof-checker's task is to be able to say, for any vertex of the
graph, what "inputs" (vertices with no edges leading to them) have
paths leading to that vertex. As an additional book-keeping task, the
checker might assign sources to each vertex, e.g. one vertex might be
labelled "axiom", another might be labelled "proved in CACM reference
so-and-so"; if all inputs leading to a vertex v are axioms, the
checker might say "I believe v." These low-level details of the
organization of the proof-checker are obvious and are not worth
dwelling on in this proposal.
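The dependency bookkeeping just described can be sketched in a few lines (Python; the class and method names are our own illustration). A proof is a list of edges given in any order; the checker answers, for any formula, which inputs (vertices nobody proves) it ultimately rests on.

```python
from collections import defaultdict

class ProofGraph:
    """Directed acyclic graph of formulae; edges record immediate
    proof dependencies, entered in any order."""

    def __init__(self):
        self.premises = defaultdict(set)   # formula -> direct premises
        self.vertices = set()

    def infer(self, conclusion, *from_premises):
        self.vertices |= {conclusion, *from_premises}
        self.premises[conclusion] |= set(from_premises)

    def inputs_of(self, v, seen=None):
        """The inputs (axioms, assumptions, citations) that v
        depends on, found by chasing premise edges."""
        seen = set() if seen is None else seen
        if v in seen:
            return set()
        seen.add(v)
        if not self.premises[v]:
            return {v}                     # an input vertex
        out = set()
        for p in self.premises[v]:
            out |= self.inputs_of(p, seen)
        return out

# Steps may be written in either order: infer P from Q first, then
# prove Q from R; P thereby comes to depend on R.
g = ProofGraph()
g.infer('P', 'Q')
g.infer('Q', 'R')
```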
When the proof-checker is built, we propose to write proofs
for a variety of programs as an approach to evaluating the
effectiveness of this method of program certification. In particular,
we are interested in the relative times it takes to generate proofs as
opposed to programs. Without such a proof checker, it is hard to
carry this out by hand. First, we do not yet have a very good idea of
what size logical steps the Show Rule can take. We are optimistic
about being able to handle formulae containing three or four ⊃∪'s and
⊃↔'s at a time, which, considering how many such formulae there are,
represents a considerable improvement over the original sparse system.
We will be pleased if it handles bigger formulae, but not particularly
disappointed if it does not. A more serious concern is whether we
will be able to learn the limitations of the Show Rule sufficiently
well that we will be able to reliably write proof steps that follow
from the Show Rule. This is a big disadvantage of having such a
complicated rule; the main thing in its defense is that if you ask the
Show Rule to take steps each of which corresponds to a step in the
original axiom system, then the Show Rule will always apply and so you
are at least no worse off with the Show Rule than with the original
axiom system. However, we conjecture that the limits of the rule are
sufficiently far from this extreme that one will be able to take
considerably larger steps without too much fear of over-stepping the
limits of the Show Rule. In any event, we are proposing to conduct
these experiments so as to be able to pronounce on the efficacy of
this approach more confidently than we are able to in this proposal.
.FILL
.BLOCK 5
BIBLIOGRAPHY
[1] Aiello, M. and R. W. Weyhrauch, Checking Proofs in the
Metamathematics of First Order Logic. Computer Science Dept, Stanford
University, August, 1974. (50 pages).
[2] de Bakker, J.W., and D. Scott. An outline of a theory
of programs. Unpublished manuscript, 1969.
[3] Floyd, R.W. Assigning Meanings to Programs. In →Mathematical
Aspects of Computer Science_ (ed. J.T. Schwartz), 19-32, 1967.
[4] Hitchcock, P. and D. Park. Induction Rules and Termination
Proofs. In →Automata, Languages and Programming_ (ed. Nivat, M.),
IRIA. North-Holland, 1973.
[5] Hoare, C.A.R. An Axiomatic Basis for Computer
Programming. CACM →12_, 576-580, 1969.
[6] Hughes, G.E. and M.J. Cresswell. →An Introduction to
Modal Logic_. London: Methuen and Co Ltd. 1972.
[7] Kripke, S. Semantical considerations on Modal Logic.
Acta Philosophica Fennica, 83-94, 1963.
[8] Manna, Z. and A. Pnueli. Axiomatic Approach to Total
Correctness of Programs. Acta Informatica →3_, 253-263, 1974.
[9] -----, and R. Waldinger. Is "sometime" sometimes better
than "always"? Intermittent assertions in proving program
correctness. Proc. 2nd Int. Conf. on Software Engineering, Oct. 1976.
[10] Mendelson, E. →Introduction to Mathematical Logic_. Van
Nostrand, N.Y. 1964.
[11] Milner, R.G. Implementation and Applications of Scott's Logic
for Computable Functions. Proc. ACM Conf. on Proving Assertions about
Programs, (SIGPLAN Notices, →7_, 1; SIGACT News, 14), 1-6. Las
Cruces, NM, Jan. 1972.
[12] Milner, R., L. Morris and M. Newey, A Logic for Computable
Functions with Reflexive and Polymorphic Types, University of
Edinburgh, LCF Report No. 1, January, 1975. (25 pages).
[13] Pratt, V.R. Semantical Considerations on Floyd-Hoare
Logic. Proc. 17th Ann. IEEE Symp. on Foundations of Comp. Sci.,
109-121. 1976.
[14] Harel, D., A.R. Meyer and V.R. Pratt. Computability and
Completeness in Logics of Programs. Proc. 9th Ann. ACM (SIGACT)
Symposium on Theory of Computing, Boulder, CO, 1977.
[15] Robinson, J.A. A Machine-oriented Logic Based on the
Resolution Principle. J. ACM →12_, 1, 23-41. Jan. 1965.
[16] Weyhrauch, R. W., and A. J. Thomas, FOL: a Proof Checker for
First-order Logic. Computer Science Dept, Stanford University,
AIM-235, September, 1974. (55 pages).
[17] Weyhrauch, R., and R. Milner, Program Semantics and
Correctness in a Mechanized Logic. First USA-Japan Computer
Conference, 1972. (9 pages).